MMCI at the TREC 2010 Web Track

Authors

  • Andreas Broschart
  • Ralf Schenkel
Abstract

Term proximity scoring models incorporate the distances between query term occurrences and are an established means of improving retrieval quality in information retrieval. The integration of such proximity scoring models into efficient query processing, however, has not been equally well studied. Existing methods rely on precomputed lists of documents in which tuples of terms, usually pairs, occur together, which typically incurs a huge index size compared to term-only indexes. This paper uses a joint framework for trading off index size and result quality. Given an upper bound on the index size, the framework provides optimization techniques for tuning precomputed indexes towards either maximal result quality or maximal query processing performance under controlled result quality.
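To make the idea of precomputed term-pair lists concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical scoring contribution of 1/d^2 for two distinct terms at distance d within a fixed window, and a simple per-list cutoff as the way to bound index size; the function names, the window size, and the pruning rule are all illustrative assumptions.

```python
from collections import defaultdict


def build_pair_index(docs, window=3):
    """Map each unordered term pair to {doc_id: accumulated proximity score}.

    Assumption: each co-occurrence at distance d <= window adds 1 / d**2.
    """
    index = defaultdict(dict)
    for doc_id, tokens in docs.items():
        scores = defaultdict(float)
        for i, t1 in enumerate(tokens):
            # Only look ahead, so every position pair is counted once.
            for j in range(i + 1, min(i + window + 1, len(tokens))):
                t2 = tokens[j]
                if t1 == t2:
                    continue
                pair = tuple(sorted((t1, t2)))
                scores[pair] += 1.0 / (j - i) ** 2
        for pair, score in scores.items():
            index[pair][doc_id] = score
    return index


def prune(index, max_entries_per_list):
    """Keep only the highest-scoring entries of each pair list (hypothetical rule)."""
    pruned = {}
    for pair, postings in index.items():
        ranked = sorted(postings.items(), key=lambda kv: (-kv[1], kv[0]))
        pruned[pair] = dict(ranked[:max_entries_per_list])
    return pruned


def index_size(index):
    """Total number of (pair, document) entries stored in the index."""
    return sum(len(postings) for postings in index.values())


docs = {
    1: ["term", "proximity", "scoring"],
    2: ["proximity", "of", "term", "scoring"],
}
index = build_pair_index(docs)
small = prune(index, max_entries_per_list=1)
print(index_size(index), index_size(small))
```

Shortening each list reduces the number of stored entries at the cost of dropping lower-scoring documents, which is the kind of size versus quality trade-off the abstract refers to.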


Similar resources

Microsoft Research at TREC 2011 Web Track

This paper describes our entry into the TREC 2011 Web track. We extracted and ranked results from the ClueWeb09 corpus using a parallel processing pipeline that avoids the generation of an inverted file. We describe the components of the parallel architecture and the pipeline, how we ran the TREC experiments, and we present effectiveness results.


The Role of Anchor Text in ClueWeb09 Retrieval

This report describes the work done at The University of Melbourne with the ClueWeb09 data corpus for the Web Track of TREC-2009 and TREC-2010, and for the Session Track of TREC-2010. We found that the impact-based retrieval model works well for the corpus, and that, along with some other factors, the use of an anchor text collection significantly boosts the retrieval effectiveness.


University of Essex at the TREC 2010 Session Track

This paper provides an overview of the experiments we carried out at the TREC 2010 Session Track. We propose an approach for interpreting reformulated queries by using query expansions derived from anchor logs which we envisage to be a potential alternative to query logs. We show that expansion with terms or phrases extracted from anchor logs improves the retrieval performance over a search ses...


Overview of the TREC 2010 Legal Track (Notebook Draft, 2010.10.25)

The TREC 2010 Legal Track consisted of two distinct tasks: the learning task, in which participants were required to estimate the probability of relevance for each document, and the interactive task, in which participants were required to identify all relevant documents using a human-in-the-loop process. 2010 is the fifth year of the legal track, the third year of the interactive task within the ...


ClueWeb09 and TREC Diversity

The TREC Web Track explores and evaluates Web retrieval technologies. The TREC 2009 Web Track included both a traditional adhoc retrieval task and a new diversity task. The goal of this diversity task is to return a ranked list of pages that together provide complete coverage for a query, while avoiding excessive redundancy in the result list. Both tasks will continue at TREC 2010, which will a...



Publication date: 2010